gss: A Package for Smoothing Spline ANOVA Models

Author

  • Chong Gu
Abstract

This document provides a brief introduction to the gss facilities for nonparametric statistical modeling in a variety of problem settings including regression, density estimation, and hazard estimation. Functional ANOVA decompositions are built into models on product domains, and modeling and inferential tools are provided for tasks such as interval estimates, the “testing” of negligible model terms, the handling of correlated data, etc. The methodological background is outlined, and data analysis is illustrated using real-data examples.

Nonparametric function estimation using stochastic data, also known as smoothing, has been studied by generations of statisticians. While scores of methods have proved successful for univariate smoothing, ones practical in multivariate settings number far fewer. Smoothing spline ANOVA models are a versatile family of smoothing methods that are suitable for both univariate and multivariate problems.

The first public release of gss dates back to 1999, when the total number of R packages on CRAN was in the dozens. The package was originally designed as a front end to RKPACK (Gu, 1989), a collection of RATFOR routines for Gaussian regression. Over the years, new functionalities have been added, numerical efficiency improved, the user interface refined, and gss has now matured into a comprehensive package that can be used in a great variety of problem settings. As active development tapers off, gss is likely to remain stable in the foreseeable future, and it is time to compile an introductory document for the current version of the package.

Technical discussions cannot be avoided in the exposition of the key methodological ingredients, but attempts have been made to keep those lucid if not rigorous. Suites of R functions for regression, density estimation, and hazard estimation are introduced with limited illustrations.
Model configurations via explicit and implicit means are described, and extra features such as semiparametric models and mixed-effect models are discussed. A trio of real-data examples are also presented to demonstrate data analysis using gss facilities. A treatise on the methodology can be found in the forthcoming second edition of Gu (2002), with systematic developments and many more software illustrations.

A word search for “spline” on the CRAN page of contributed packages, at the end of 2011, highlighted about twenty entries with the word appearing in the package names or in the short descriptions, and these are on top of splines and mgcv in the standard library. The splines package provides facilities for the handling of univariate basis functions such as the B-splines (de Boor, 1978), and most of the other contributed packages concern some niche applications of the smoothing spline or regression spline techniques. More comprehensive in scope are mgcv by Wood, which mainly fits additive regression models using penalized regression splines, and assist by Wang and Ke, which features mixed-effect models in regression settings and employs the legacy RKPACK routines for numerical computation. The other packages may do things better here and there, but none seems to be as comprehensive as gss.

Methodological background

Observing $Y_i \sim N(\eta(x_i), \sigma^2)$, $i = 1, \dots, n$, on $x_i \in [0,1]$, one may estimate $\eta(x)$ via the minimization of a penalized least squares functional,

$$\frac{1}{n} \sum_{i=1}^{n} \bigl( Y_i - \eta(x_i) \bigr)^2 + \lambda \int_0^1 \bigl( \eta''(x) \bigr)^2 \, dx. \tag{1}$$

The minimization takes place in a function space
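As a rough illustration of the trade-off in the penalized least squares functional (1), the sketch below minimizes a discretized analogue on an equally spaced grid in [0,1]. It is not the gss algorithm and does not call gss. The integral of the squared second derivative is replaced by scaled squared second differences (a Whittaker-type smoother), and the function names `second_difference_penalty`, `solve`, and `smooth` are hypothetical, chosen only for this example.

```python
# Discretized analogue of the penalized least squares functional (1).
# Assumption: observations y_1..y_n sit on an equally spaced grid in [0, 1],
# and the roughness integral is approximated by squared second differences.
# This is an illustrative sketch, not the method implemented in gss.

def second_difference_penalty(n, h):
    """Return P = D^T D / h^3, with D the (n-2) x n second-difference matrix.

    f^T P f approximates the integral of (f'')^2 over [0, 1].
    """
    P = [[0.0] * n for _ in range(n)]
    stencil = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for a in range(3):
            for b in range(3):
                P[r + a][r + b] += stencil[a] * stencil[b] / h**3
    return P


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def smooth(y, lam):
    """Minimize (1/n) * sum((y_i - f_i)^2) + lam * f^T P f over f.

    Setting the gradient to zero gives the linear system (I + n*lam*P) f = y.
    """
    n = len(y)
    h = 1.0 / (n - 1)
    P = second_difference_penalty(n, h)
    A = [[n * lam * P[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return solve(A, y)


if __name__ == "__main__":
    import math
    import random

    random.seed(1)
    n = 21
    x = [i / (n - 1) for i in range(n)]
    y = [math.sin(2 * math.pi * xi) + random.gauss(0.0, 0.3) for xi in x]
    for lam in (0.0, 1e-5, 1e-2):
        fit = smooth(y, lam)
        print(lam, [round(v, 2) for v in fit[:5]])
```

With `lam = 0` the fit interpolates the data, while larger values of `lam` pull the fit toward a straight line, since straight lines have zero second differences and are left unpenalized.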

Similar resources

Analysis of Mortality Data using Smoothing Spline Poisson Regression

We study a smoothing spline Poisson regression model for the analysis of mortality data. Being a non-parametric approach, it is intrinsically robust; being a penalized likelihood estimation method, it makes available an approximate Bayesian confidence interval; and, importantly, the software gss, its implementation on the freely available statistical package R, makes it easily accessible to the us...

GAMs with integrated model selection using penalized regression splines and applications to environmental modelling

Generalized Additive Models (GAMs) have been popularized by the work of Hastie and Tibshirani (1990) and the availability of user friendly GAM software in Splus. However, whilst it is flexible and efficient, the GAM framework based on backfitting with linear smoothers presents some difficulties when it comes to model selection and inference. On the other hand, the mathematically elegant work of...

Use of Two Smoothing Parameters in Penalized Spline Estimator for Bi-variate Predictor Non-parametric Regression Model

Penalized spline criteria combine a goodness-of-fit function and a penalty function, the latter containing the smoothing parameters. These parameters control the smoothness of the curve, working together with the knot points and the spline degree. The regression function with two predictors in the non-parametric model will have two different non-parametric regression functions. Therefore, we...

The crs Package

This vignette outlines the implementation of the regression spline method contained in the R crs package, and also presents a few illustrative examples.

The crs Package: Nonparametric Regression Splines for Continuous and Categorical Predictors

A new package crs is introduced for computing nonparametric regression (and quantile) splines in the presence of both continuous and categorical predictors. B-splines are employed in the regression model for the continuous predictors and kernel weighting is employed for the categorical predictors. We also develop a simple R interface to NOMAD, which is a mixed integer optimization solver used t...

Publication date: 2012